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Abstract 

Machupo virus (MACV) is a New World arenavirus that is currently emerging from 
a rodent reservoir into the human population. For New World arenaviruses, it is known 
that transferrin receptor binding is a major determinate of host range variation. To 
better understand the relationship between the virus and its human host, we used a 
structure-based computational model to interrogate the effects of mutations in the in- 
terface between the MACV spike glycoprotein (MACVGP) and the Human transferrin 
receptor (hTfRl). Existing computational methods to probe protein-protein interac- 
tions perform poorly with non-alanine mutations and on the flexible loops encountered 
in many host-virus interactions. We applied steered molecular dynamics (SMD) sim- 
ulations to induce protein dissociation; using descriptive statistics, we were able to 
differentiate mutant complexes and biochemically rationalize available infectivity data. 
The previously published co-crystal structure implies two important hydrogen bonding 
networks in the MACVGP/hTfRl interface; our simulations confirm one of them is 
critical for viral binding. Also, since mutations in polar residues in another putative 
network do not change viral binding, we can conclude the second network is likely an 
artifact of crystallization. Finally, we found a viral site known to be critical for in- 
fection, may mark an important evolutionary suppressor site for infection-diminishing 
hTfRl mutants. Taken together, our computational data rationalize many of the avail- 
able experimental data, further refine our biochemical understanding of this system, 
and point to the most likely next evolutionary step for MACV. 



Introduction 

In the viral life cycle, cellular fusion is the first and perhaps most critical step. The protein- 
protein interaction at the host-virus interface represents an upper limit to viral evolvability: 
If the host evolves to completely block binding, it escapes infection in perpetuity. Despite 
the importance of this interaction, there is a surprising lack of effective computational tools 
to make predictions for viral mutations. 

MACV is a clade B New World arenavirus that presents clinically with a dramatic hemor- 



rhagic fever and high fatality rate (Radoshitsky et al. 2008). It is classified as a US National 
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Institute of Allergy and Infectious Disease Category A Priority Pathogen and is listed among 
US Select Agents and Risk Group 4 Pathogens. Although the virus cannot efficiently trans- 
fer among humans, there is a substantial risk to public health from natural and artificial 
selection for improved infectivity. Evolutionary data suggests the viral binding event is the 
critical bottleneck for efficient mammalian infection, and in many cases can be used as a 



proxy for infectivity (Radoshitsky et al. 2011). Further, it is known that several targeted 
mutations in the hTfRl can completely block infection in various South American rodent 
species. However, the exact biochemical mechanism of rodent resistance is an unresolved 
problem. 

The transferrin receptor is the primary receptor used by MACV for fusion and invasion. 
Cellular entry is mediated by the MACVGP binding to the apical domain of hTfRl. While 
hTfRl is used by the host for iron-scavenging, all such portions of the receptor are isolated 
on the basilar, membrane-associated domain; without any obvious function, the purpose 
of the hTfRl apical domain remains an open question. According to a published co-crystal 



structure a 15 amino acid flexible loop on MACVGP is responsible for viral binding (Abraham 



et al.||2010[ [Radoshitsky et al.||2011[ ), and many of the interface interactions are mediated by 



extensive hydrogen bond networks (Abraham et al. 2010). Alanine-scanning and whole cell 
infectivity assays have identifled several sites in both MACVGP and hTfRl that are probably 



critical for establishing infection (Choe et al. 2011 Radoshitsky et al. 2011). 



Inspired by the relatively large canon of experimental data (Abraham et al. 2010 Choe 



et al. 2011; Radoshitsky et al. 2011), we applied an all-atom molecular dynamics method to 



computationally interrogate a number of receptor and viral mutants; as a proxy for relative 
binding affinity, we were able to use several simpler descriptive statistics. In WT and mutant 
complexes, after inducing dissocation of MACVGP from a stationary hTfRl, we could resolve 
relative differences in unbinding statistics and predict significant affinity changes. At sites 
known to be important for successful viral infection, we tested many additional mutations; 
we found that the mechanistic cause of the phenotypic effect is often not as simple as the 
static structure suggests. We show mutations at Tyr211 to small, nonpolar amino acids will 
produce a disruption in viral fusion. Also, although Asn348 in hTfRl is important for viral 
fusion, mutations there are not likely to affect viral entry. Moreover, our computational data 
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suggests a mutation to alanine at site 111 in MACVGP may act as a major supressor to 
hydrophobic, infection-blocking mutations at Tyr211 in hTfRl. Finally, our study offers an 
all-atom steered molecular dynamic approach to avoid the pitfalls of several existing methods 
used to evaluate mutations in host-virus protein-protein interfaces. 



Materials and Methods 



System Modeling 



We used the protein visualization software PyMOL (Schrodinger 2010) to remove residues 
121-190, 301-329, and 383-756 in the hTfRl; no residues were removed from the viral protein. 
We also removed 10 sugar residues from the native structure. For the purposes of notation, 
a 'v' will always precede the MACVGP site. 



After system reduction, the Visual Molecular Dynamics (VMD) (Humphrey et al. 1996) 
package along with its system of back-ends was used for all subsequent modeling. The Orient 
add-on package allowed us to rotate the system axis such that the direction of steering was 
oriented directly down the z-axis. De-glycosylation simplified the system such that Autopsf 
could easily find the chain terminations and patch them appropriately. The Solvate package 
was used to generate a TIP3P water model with a 5 Angstrom buffer (relative to the maximum 
dimensions of the proteins) on all sides except down the positive z-axis where a 20 Angstrom 
buffer was created. Finally, we used the Autoionize package to place 150 millimolar NaCl 
and neutralize the total system charge. In the end, each modeled system had approximately 
28,000 atoms. 



Equilibration 



NAMD was used for all simulations in this study (Phillips et al. 2005). In addition to the 
modeled system, for equilibration we generated a file with a fixed a-carbon backbone; this 
was accomplished by setting /3 = 1. Further, we generated a file with fixed a-carbon atoms 
at residues 41-92 (numbered linearly, in this case, starting at 1 for the first amino acid as was 
required for VMD) in the hTfRl. The second file was used to affix a harmonic restraint thus 
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preventing any unfolding due to system reduction. More importantly, the harmonic restraint 
allowed the protein complex to equilibrate while preventing any drift from its predefined 
orientation. Finally, we calculated the system center and dimensions for use in molecular 
dynamics settings. 



We used the Charmm27 (Brooks et al. 1983) all atom force field. The initial system 



temperature was set to 310K. Several typical MD settings were used including switching 
and cutoff; in addition, we used a 2 femtosecond time step with rigid bonds (the exact 
configuration is provided in the supplement). We used periodic boundary conditions with 
the particle mesh ewald (PME) method of computing full system electrostatics outside of the 
explicit box. Furthermore, we used a group pressure cell, flexible box, and langevin piston 
during equilibration. Finally, a harmonic restraint (called harmonic constraint in VMD) was 
set as stated previously. 

To start the simulation, the langevin piston was switched off and the system was mini- 
mized for 1000 steps. Next, the fixed backbone was released, and the system was minimized 
for an additional 1000 time steps. Subsequently, the system was released into all-atom molec- 
ular dynamics for 3000 steps. Finally, the langevin piston was turned on and the system was 
simulated for 2 ns (1,000,000 steps) of chemical time. For each mutant, twenty independent 
replicates were run with an identical protocol. 

Steered Molecular Dynamics 

Following equilibration, the final state of each simulation was used to generate a file fixing the 
a-carbon on residues 1, 58, 73-83, 96, 136, 137, 138, and 161 (again with linear numbering) in 
the hTfRl. Also, the a-carbon of all residues (163-318 in linear numbering) in the MACVGP 
were given an occupancy of 1, indicating they had an applied force during the simulation. 
In every way possible the subsequent steering simulation had the same parameters as 



the equilibration simulation. Thus, the steered molecular dynamics (SMD) (Isralewitz et al. 



2001a) simulation was a typical restart with slightly different settings. Periodic boundary 
conditions were incorporated as part of the system and PME was again used to approximate 
full system electrostatics. In addition, we first tuned the pulling velocity for practical time 
considerations. Subsequently, the spring constant was adjusted to ensure sufficient signal to 



noise with the given velocity. As a result, the spring constant was set to 5 (units), velocity 
set to 2 (units), and direction down the positive 2;-axis. 

Finally, SMD was run for 15 ns (7,500,000) of chemical time. For each simulation, we 
randomly selected one of the equilibration runs for restart; per mutant, all of the results 
presented here are from 50 such replicates. All MACVGP/hTfRl complexes separated by 
greater than 4 Angstroms and many separated to 10 or more. As Figure 1 indicates, 4 A 
separation was more than sufficient to compare the point mutant variation in most cases. 

To leave the final trajectory of a tractable size, only 1000 evenly space frames were retained 
from each simulation, leaving a final trajectory size of 323 MB. See the supplemental movie for 
a representative unbinding trajectory. All SMD simulations were performed on the Hrothgar 
cluster at Texas Tech University using NAMD 2.9. Each simulation was parallelized over 
60 computational cores and utilized approximately 20 hours of computing time. The total 
chemical time simulated for this project was nearly 10 /is, requiring slightly over 1 million 
cpu-hours. 



Post-processing 



The python packages MDAnalysis (Michaud-Agrawal et al. 2011) and ProDy (Bakan et al. 



2011) were both used at various points in post-processing. The molecular trajectory (com- 
prising the atomic coordinates per time) was parsed to compute the center-of-mass for each 
of the two complexes. The starting center-of-mass distance was set to zero and the distance 
was re-computed at each time step relative to the starting distance. The force is output 
directly to the standard log file. 

The statistical package R was used for all further analysis and visualization. Each of the 
50 independent trajectories per mutant produced a fairly noisy force curve. To reduce the 
noise, we averaged each point with a 75 point interpolation window; the window ensured 
the maximum force was never cut off due to lost time steps (caused by averaging) near the 
beginning of the simulation. The two primary descriptive statistics we used were maximum 
interpolated force and total area under the interpolated curve. We used the pairwise.t.test 



function with the FDR p-value correction to test for significance. The ggplot (Wickham 



2009 ) package was used to generate most of the figures. 



6 



Results 



The MACVGP/hTfRl interface 

The MACVGP /hTfRl interface marks a particularly important and useful test system. There 
are several sites on both the human and viral protein known to affect the infectivity phenotype 
of MACV; virtually all of the important sites have been mapped by in vitro flow-cytometry 
based infectivity assays. Since cellular binding and invasion is required for viral infection, the 
biochemical properties of host-virus protein-protein interfaces have a direct impact on the 
fitness of viruses. As a consequence, infectivity assays can provide much needed information 
about the strength of the host- virus interaction. The MACVGP/hTfRl interface appears 
not to be dominated by one particular type of interaction (electrostatics, hydrogen-bonding, 
or van der Waals). In addition, much of the fusion domain of MACVGP is on a flexible loop, 
making many other computational techniques only marginally useful ( Gront et al.|2011 ). The 
complex nature of this interface represents a particularly difficult challenge for traditional 
computational analysis. 

To drive the system through equilibrium, we applied an external force to one of the 
proteins in the host/virus complex. To begin, we required an experimentally determined 
atomic model. Due to practical computational limits, the model system needed to be reduced 
to the smallest number of atoms that were capable of producing empirically meaningful 
results; in addition, the complex was subsequently modeled into a physiologically meaningful 
biochemical environment. To apply a driving force, one of the two proteins was anchored in 
3D space while a force was applied to the other. We then computed the force and distance 
at each time step to develop a representative dissociation curve for each mutant; each of our 
test statistics was derived from the force versus distances curves. 

We used the experimentally determined MACVGP/hTfRl structure (PDBID: 3KAS) 



(Abraham et al. 2010). The hTfRl has two mostly independent domains; one interacts 
with MACVGP while the other is membrane associated. The independent nature of the two 
domains meant we were able to remove the membrane-associated domain without significantly 
affecting the MACVGP/hTfRl interaction. Figure 2 shows a model of the initial structure 
and that of the pared structure. Although MACVGP has several glycosylated residues. 
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we opted to use the de-glycosylated protein for this study. The complexity of correctly 
parameterizing diverse sugar moieties is outside of the scope of this paper. Furthermore, 
although it is known that MACVGP is glycosylated, and some of those sugars contact hTfRl, 



the effect of those contacts in vivo is still unresolved (Abraham et al. 2010). 



Steered molecular dynamics approach 

We applied a method using steering forces in molecular dynamics simulations to investigate 
the effects of point mutations at the MACVGP/hTfRl interface. Conceptually, we use an 
applied force to pull the viral glycoprotein away from its human receptor and measure the re- 



sulting force. From the simulation, we can calculate physically relevant quantities (Isralewitz 



eraL]|2001bp[l ). As we are more interested in the phenotypic impact of interface mutations, 
rather than physically rigorous calculations, we opted for more biologically intuitive descrip- 
tive statistics (Table 1). 

To ensure a natural solution structure, we performed an initial equilibration phase to 
reduce the intrinsic crystallization distortions. First, we generated a physically realistic sim- 
ulation world around the protein complex; this environment included an explicitly modeled 
bath of water, and a physiologically relevant concentration of ions. Next, with a fixed a- 
carbon backbone we carried out successive rounds of energy minimization, followed by heat- 
ing, and backbone release. Last, with minimal restraints a short round of all-atom molecular 



dynamics (Cuendet and Michielin 2008) was performed. We used the final state from each 



equilibrated system to restart another MD simulation. For SMD, at the centroid of several 
atoms a force was applied to a single point. By using a known spring constant, and pulling 



with a constant velocity, we could calculate the applied force of dissociation (Cuendet and 



Michielin 2008 Cuendet and Zoete 2011). A typical averaged force curve comparison can 



be seen in Figure 1. The force curves for each mutant were smoothed either by averaging 
over many replicates as in Figure 1 or by selecting an interpolation window for averaging; 
the latter approach was used for statistical analysis. In addition, to minimize the compu- 
tational time in unproductive molecular dynamics, we shortened simulation time such that 
the interaction just barely dropped to zero. As seen in Figure 1, this distance was relatively 
consistent among mutants; supplement 1 illustrates the separation distance between peptide 
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domains. 



Comparative analysis of MACVGP/hTfRl interface 

Before systematically applying SMD to the MACVGP/hTfRl interaction, we needed to en- 
sure the method was sufficiently sensitive to distinguish between relatively minor point muta- 
tions. While SMD has been applied previously to measure the binding energy of high-affinity 



T-cell receptor interactions (Cuendet and Michielin 2008 Cuendet and Zoete 2011), to our 



knowledge the method has never been used previously to parse small differences in a com- 
plex energy landscape. For mutant comparison, we chose to use descriptive statistics rather 
than physically rigorous calculations. Several residues on both proteins appear to mediate 
putative hydrogen-bonding interactions. We selected three sites total; Tyr211 and Asn348 in 
the hTfRl and vArglll in MACV. For notation purposes, the viral site is always referred to 
with a preceding 'v'. We initially tested alanine substitutions congruent with the traditional 
experimental and computational approach. 

For the sake of reliability, we smoothed each of the force curves over a 75 time point 
interpolation window. In many cases, maximum force was sufficient to distinguish mutants 
(Table 2), and AUG was able to add a few more statistically significant differences (Table 
3). Relative to WT one alanine mutation produced a very large difference in the maximum 
force and area of the force curve (AUG) (Figure 1). Although only one point mutant could 
be resolved from the WT complex, among the array of substitutions we tested, two of the 
three sites displayed very large affinity changes. As more mutants are tested, clustering them 
into one of three classes may produce more consistent statistical results. 

Gonsidering the involvement of extended hydrogen-bonding networks, it was not clear 
that alanine mutations, even those that should destroy such networks, would significantly 
change the strength of interaction. One major advantage of first principles simulations is the 
ability to test mutations other than alanine without additional underlying assumptions in the 
energy function. We made additional mutations based on biochemical intuition or available 
experimental data to chemically diverse amino acids including tryptophan, lysine, aspartate, 
and threonine. Several mutations caused significant relative affinity changes. In addition, to 
detect synergistic effects, we tested several double mutants where both mutations appeared 
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to cause reduced binding; then, we compared the size of those differences to reduced binding 
single mutants (Figure 3). 

In total, we tested 7 point mutants and 3 double mutants in addition to the WT complex. 
The largest single mutant impact was the alanine mutation Tyr21lAla in the hTfRl {p < 
0.001). Via in vitro flow-cytometry infection assays or by known host-range limitations. 



several mutations at this site have proven capable of causing loss-of-infection (Radoshitsky 



et al.||2008[ |Choe et al.||2011[ [Radoshitsky et al.||201l] ). Most likely this effect is caused by the 



destruction of a critical hydrogen bond to Serll3 in the MACVGP; the lost hydrogen bond 
would lead to the subsequent loss of a large hydrogen-bonding network seen in the crystal 



structure (Abraham et al. 2010). In agreement with the static structure, our molecular 
dynamics simulations show Tyr211 is involved in a large hydrogen-bonding network (results 
not shown). 

Although Tyr211Ala appears to have a large impact on binding affinity, the single mu- 
tant provided little underlying biochemical rationale. Since alanine is both smaller than 
tyrosine and also incapable of participating in hydrogen-bond interactions, more mutations 
were needed to probe the critical biochemical difference. First, we substituted smaller side 
chains that, like tyrosine, were capable of hydrogen bonding; it is known that several substi- 



tutions at Tyr211 are the result of pressure on hosts in rodent populations (Radoshitsky et al. 



2008 Choe et al. 2011 Radoshitsky et al. 2011). Both Tyr21lAsp and Tyr211Thr proved 



incapable of causing reduced binding affinity; in fact, both mutations caused a significant 
increase in relative binding affinity (Figure 2). This is not surprising considering the fact 
that Tyr211 participates in a bonded network rather than independently. Combined with 
some biochemical intuition, these results imply that a mutation to other polar residues would 
not affect the dynamics of binding directly. Since it has been shown that MACVGP binding 



does little to affect hTfRl structure (Abraham et al. 2010), such effects are likely exerted 
upstream through translation, transport, or slight folding differences. 

We also simulated several point mutations at Asn348 in the hTfRl. Asn348Lys is reported 



in the literature to cause reduced infectivity in vivo (Radoshitsky et al. 2008 Abraham 



et al. 2010). Further, in the crystal structure two residues on the MACVGP and at least 



three on hTfRl are involved in a fairly extensive hydrogen-bonding network. Initially we 
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tested the alanine mutation which resulted in a slightly higher but not statistically significant 
difference from WT (Table 4). In addition, the Asn348Lys mutation, known to have a 
phenotypic effect, showed no significant difference from WT; similarly, Asn348Trp was not 
significantly different from WT. On the other hand, there was a detectable difference between 
Asn348Ala and Asn348Lys (Table 2 and Table 3), with Asn348Lys being a weaker binder; 
also, Asn348Trp showed nearly identical results to Asn348Lys. It seems, in contrast to 
Tyr211, the critical difference at position 348 in hTfRl is caused by size exclusion rather 
than by the destruction of a hydrogen-bonding network; this can be deduced by noting that 
Asn348Trp and Asn348Lys produce nearly identical affinity changes (Table 4). To check the 
consistency of our results, we hypothesized the combination of Tyr21lAla and Asn348Trp, 
being chemically disconnected in two different hydrogen bonding networks, would lead to a 
synergistic loss-of-binding. As expected, the double mutant was the weakest binding mutant 
tested {p < 10^^) (Table 2 and Table 3) in this study. 

Last, we tested a single mutation in the MACVGP, vArglllAla. It has previously been 



reported that vArglllAla causes a loss-of- infection during in vitro infectivity assays (Ra- 



doshitsky et al. 2011). While the data shows a lower maximum force and total energy, none 



of our test statistics showed a significant difference between WT and Arg21lAla (Table 2 and 
Table 3). However, in combination with Tyr21lAla, vArglllAla in the MACVGP appeared 
capable of suppressing the loss-of-binding caused by Tyr21lAla. Although unexpected, this 
result may not be so surprising since we were replacing a hydrogen bonding network with a 
hydrophobic interface. 



Discussion 

We have applied a method utilizing steering forces in all-atom molecular dynamics simulations 
to evaluate the effects of mutations at the MACVGP /hTfRl interface. We modeled mutations 
at several sites in the MACVGP/hTfRl interface, and verified our computational protocol 
was sensitive enough to distinguish point mutants in hTfRl; further, we identified several 
test statistics that may be important parameters for infectivity. We systematically tested 
several point mutations to understand their contribution to the binding interaction. In 
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the case of Asn348Lys, the loss-of-infectivity seen in vivo could not be easily inferred from 
the static structure; the critical difference for mutations at Asn348 lies in size and charge 
restriction. Further, we have shown the cause of diminished infectivity in the viral single- 
mutant vArglllAla is probably not due to binding energy differences. We also found that 
a negatively polar residue at site 211 in hTfRl is absolutely critical for a tight-binding 
interaction; any non-polar mutation at Tyr211 in hTfRl is likely to completely halt viral entry 
and dramatically decrease the chances of MACV infection. However, our data suggest the 
viral mutation vArglllAla may restore the infective potential of MACV that was previously 
destroyed by a Tyr21lAla hTfRl mutation. 

Making predictions regarding the effects of non-alanine mutations at protein-protein in- 
terfaces has thus far met with little success. There are several major obstacles that have 
contributed to this failure. One issue is the enormous number of chemically allowed three 



dimensional configurations in proteins (Gumbart et al. 2012). There are perhaps three, but 
more likely two, broadly useful computational approaches to make predictions about the 
relative strength of protein-protein interactions. First, to be physically rigorous one can 



try equilibrium all-atom simulations from physical first principles (Gumbart et al. 2012). 
The most straightforward approach would be to define the fundamental physical equations 
governing the system of interest, then simulate for a sufficiently long time period. After 
equilibrium is reached, one could compute a binding constant by observing the fractional 
association/dissociation time in a single long simulation. Unfortunately, due to limited com- 
putational power and available methods, reaching equilibrium with a simulated protein sys- 
tem is truly a herculean task and verifying it is essentially impossible ( Gumbart et al.|[2012 ). 
Second, to reduce computational complexity several studies have used rigid body docking 
and machine learning ( Vreven et al.||2011 2012 Bajaj et al.||2011 Hwang et al.||2010 ). Unfor- 
tunately, due to their dependence of available data, these methods often suffer in applications 
to novel systems. Third, there are also several hybrid approaches exploiting non-equilibrium 
calculations within the context of all atom simulations (Isralewitz et al. 2001b||a Park and 



Schulten]|2004 Gront et al.|2011 ). Many of these methods have either not yet been applied to 



protein-protein interactions due to practical limitations or produce inconsistent results when 
larger structural rearrangements are critical for affinity changes. 
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By extending the SMD technique to investigate a protein-protein interactions we have, 
in many ways, changed the scope of its apphcation. Traditionally this technique is either 



applied to compute equilibrium quantities via a non-equilibrium approximation (Park et al. 



2003 Park and Schulten 2004; Buch et al. 2011 Giorgino et al. 2012), used to estimate 



protein stability through unfolding ( Lu and Schulten||1999 ), or used to calculate the absolute 
free energy of small molecule ligand- binding (Dixit and Chipot 2001). Likewise, others have 
used SMD to understand the process of binding and unbinding at a resolution unmatched by 



experiment (Cuendet and Zoete 2011 Giorgino et al. 2012). And still others apply driving 
forces to untie polypeptide knots in toy systems ( Sulkowska et al.|2010 ). Here, we have shown 
SMD can provide some insight into the relative strength of protein-protein interactions; in 
contrast to more traditional absolute affinity calculations, relative binding is much more 
useful for practical application. Further, via whole cell infectivity assays it is often difficult 
to parse the various biophysical contributions to infectivity. Our method can eliminate at 
least one of the usual suspects (direct biochemical interactions), opening other avenues for 
subsequent experimental investigation. 

Our findings rationalize several effects observed in both infectivity data and rodent pop- 



ulations (Radoshitsky et al. 2008 Choe et al. 2011). We found that some substitutions at 
positions 211 and 348 did negatively affect the strength of viral binding. In the two cases, 
the computational data suggests the reason and nature of the effects are very different. Con- 
gruent with infectivity data, mutations at Tyr211 often have a dramatic effect on receptor 



binding (Radoshitsky et al. 2008; Choe et al. 2011). Tyr21lAla alone produces such a large 
decrease in affinity that it can be easily distinguished from the WT complex. Structural 
analysis reveals the biochemical cause involves the destruction of a critical hydrogen bond- 
ing network with Serll3 and vArglll in MACVGP. Further, Tyr21lAsp and Tyr211Thr 
were easily resolved from the WT complex, but descriptitve statistics suggest they cause 
an increase in affinity. As a result, we predict experimental structural analysis will show 
Tyr211Asp and Tyr211Thr mutations cause a relatively dramatic tertiary structure devi- 
ation. Given that our model starts from an associated complex, it is not surprising that 
Tyr21lAsp would cause increased rather than decreased affinity; aspartate is charged rather 
than polar at physiological pH and its shorter side chain means the proteins would be pulled 
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in closer apposition. Nevertheless, our results show Tyr211, or some other amino acid that 
is capable of hydrogen bonding, is necessary for a stable hydrogen bond network leading to 
efficient binding. Moreover, the cause of decreased infectivity seen in rodent populations 
with the Tyr21lAsp mutation is not a simple change in binding affinity. 

Previous work has shown the hTfRl mutation Asn348Lys is capable of producing de- 



creased infectivity (Radoshitsky et al. 2008 Choe et al. 2011). Moreover, the molecular 
model based on x-ray diffraction data suggests destruction of a second hydrogen bonding 



network as the cause (Abraham et al. 2010). In our simulations no single mutation at 
Asn348 could be distinguished from the WT complex. Moreover, the mutation Asn348Ala 
resulted in an average maximum force that was virtually identical to WT hTfRl (Table 4); 
thus, the putative hydrogen bonding network involving Asn348 is probably a crystallization 
artifact. Even when paired with the highly destructive Tyr21lAla mutation, Asn348Ala was 
refractory to changes in binding affinity. By contrast, when Tyr211Ala was paired with large 
(Trp) and positively charged (Lys) substitutions at position 348, the result was a larger than 
expected synergistic difference. That is, the double mutant Tyr21lAla/Asn348Trp caused 
a much larger decrease in binding than we expected from either mutation individually. Al- 
though Asn348Trp and Asn348Lys individually failed to be distinguishable from WT hTfRl, 
they both caused a lower mean max force and produced larger than expected synergistic 
effects. This data suggests site 348 is size and charge restricted; any mutation to a very large 
or positively charged side chain will likely produce decreased infectivity. 

Although our computational predictions largely agreed with the available literature, there 
was one significant deviation; the mutation vArglllAla was indistinguishable from the WT 
complex. Other than binding, there are several reasons why a viral mutation may cause a 
phenotypic difference in infectivity. First, across disconnected experiments there could be 
a gross difference in the MACVGP concentration. Although western blots are a popular 
means of protein concentration normalization, when used in a quantitative manner they 
are notoriously difficult and unreliable. In addition, though western blots may show gross 
differences in protein concentration, viral binding may be very sensitive to small differences 
in fusion protein concentration. Second, even if we trust protein expression normalization via 
western blot, it is not clear what fraction of protein is grossly or microscopically misfolded; 
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an additional analytical step with circular dicliroism or an analogous technique may further 
clarify large-scale folding differences. Third, since we start with a bound structure, any 
changes that may dramatically affect the rate of association (different folds, trafficking issues, 
etc.) would be underestimated by our method. Although the single viral mutant vArglllAla 
did not produce any noticeable effects, we thought its binding orientation near Tyr211 and 
involvement in the critical hydrogen bond network may make it capable of producing a 
suppressor mutant. Although Tyr21lAla was the most disruptive single mutant we tested, 
vArglllAla in the MACVGP was able to restore mean maximum force to WT levels (Table 
1). As mutations at liTfRl site 211 are known to reduce infectivity, vArglll in MACVGP 
will probably serve a supressor site for compensatory MACV mutations. Moreover, this site 
may be critically important for surveillance in the rodent population. 

As with any structural approach, we are constrained to start our computational model 
system from a static structure. One unfortunate reality in structural biology is poor resolu- 
tion/discrimination at particularly long and flexible amino acid side chains. In addition, the 
number of acceptable side chain rotamers increases in large amino acids; as a consequence 
there could be many untested starting conformations. Although we believe we have min- 
imized unsampled conformations here, such an issue could be overcome with a sufficiently 
large number of SMD starting models. Of course, increasing sampling in this case comes 
at significant computational cost. There are a few additional challenges for investigating 
host- virus interaction via molecular dynamics simulation. As with any atomistic simulation, 
there is going to be a fairly large noise to signal ratio. To reduce noise, one could further cus- 
tomize each simulation. Furthermore, larger amounts of computational resources will have a 



direct and powerful impact on the strength of any atomistic study (Jensen et al. 2012); such 
resources could come in the form of increased compute time, improved code, or customized 
hardware for floating point operations ( Shaw et al.|2011 ). With improved resources, we could 
investigate thousands of individual permutations in the MACVGP/hTfRl binding interface. 
Going forward, all-atom molecular dynamics simulations can become an increasingly viable 
approach for protein interaction studies. 

We have probed the biochemical effects of mutations at the host-virus interface between 
MACV and its human receptor. We applied an approach that imparts forces in an all-atom 
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molecular dynamics (MD) simulation to drive the system through equilibrium. Our work has 
shown that with relatively modest computing resources it is possible to use MD simulations 
to investigate point mutations in the MACVGP/hTfRl interface. In most double 
mutant was easily sufficient to resolve a difference in relative affinity; moreover, relative to the 
WT complex one point mutant was capable of producing a large decrease in binding affinity. 
We applied our technique to the WT complex and 10 mutations in the MACVGP/hTfRl 
interface. We were able to rationalize the results of infectivity data with fundamental physical 
simulations. Our data has provided some direction for further biochemical intuition in this 
system. A polar residue is absolutely critical at position 211 in liTfRl; any mutation to a 
nonpolar residue will probably lead to a rapid viral compensatory response. Also, although 
rodent populations may produce further mutations at Asn348 in hTfRl, unless a mutation 
is positively charged or very large it is unlikely to result in a large disruption to viral fusion. 
We have also provided both a site and mutation that will probably be the next step in the 
evolutionary arms race between mammals and MACV. Finally, our analysis offers a new 
means to systemically investigate host-virus interactions through computational molecular 
dynamics. 
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Figure 1: Force versus distance curve of Y211A mutant and WT. The average force curve for 
50 replicates of the WT complex was black, and the average of 50 replicates of the Y211A 
mutant was red. Notice there is a very large difference in maximum force and area under the 
curve between the two complexes. 
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Figure 2: Comparison of MACVGP/hTfRl complex. Figure A shows the full, de-glycosylated 
MACVGP/hTfRl co-crystal structure. For figure B, we show the reduced structure for SMD. 
MACVGP is blue and hTfRl is green. 
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Figure 3: Density plot of three different MACVGP/hTfRl complexes. The WT MACVGP- 
hTfRl complex in the middle is flanked by the tighter binding mutant Y211D on the right and 
the weaker binding double mutant N348W/Y211A on the left. The large non-overlapping 
areas indicate a large and statistically significant difference in these three complexes with 
relatively minor differences. 
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Table 1: Summary statistics for each mutation tested. jJ^MaxForce is the mean and (JMaxForce 
is the standard deviation of maximum force over all simulations, i^auc is the mean and (Tauc 
is the standard deviation of area under the curve over all simulations. 



Mutation 


I^MaxForce 


(^MaxForce 


^^Auc 


<^AUC 


WT 


772.00 


154.13 


198794.7 


66765, 


.64 


N348A 


799.33 


154.36 


202247.3 


65175, 


,78 


N348K 


731.36 


151.01 


179087.5 


55799, 


,49 


N348W 


732.13 


131.21 


192043.2 


66916, 


,36 


vRlllA 


754.39 


103.25 


185896.8 


46408, 


.74 


Y211A 


684.03 


126.69 


153517.7 


40889, 


.12 


Y211D 


875.44 


116.46 


234909.3 


50494, 


.11 


Y211T 


861.17 


158.48 


231078.7 


97569, 


.29 


N348A/Y211A 


762.96 


144.81 


173059.2 


59898, 


.10 


N348W/Y211A 


617.21 


163.33 


147705.6 


48548, 


,61 


VR111A/Y211A 


780.71 


135.65 


189919.1 


48868, 


,16 
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Table 2: Pairwise difference p— values for interpolated maximum force. Bolded values are 
statistically significant at p < 0.05. 



WT N348A N348W N348K Y211A Y211D Y211T vRlllA N348A/Y211A N348W/Y211A 

N348A 0.35 



N348W 


0.22 


0.012 












N348K 


0.22 


0.012 


0.98 










Y211A 


0.0045 


1.6x10-* 


0.14 


0.15 








Y211D 


0.00083 


0.0045 


3.9x10"" 


3.9x10"" 


5.7x10-'° 






Y211T 


0.0042 


0.022 


3.0x10-"= 


3.0x10-"= 


1.0x10-"* 


0.67 




vRlllA 


0.59 


0.11 


0.51 


0.51 


0.024 


8.9x10-"'' 


0.00056 


N348A/Y211A 


0.78 


0.20 


0.35 


0.35 


0.011 


0.00026 


0.0016 0.78 


N348W/Y211A 


6.2X10-'"' 


9.3x10"'^ 


0.00021 


0.00026 


0.032 


< 2x10-'" 


2.4x10-'= 9.1x10-"" 2.8x10 


VR111A/Y211A 


0.78 


0.52 


0.14 


0.14 


0.0019 


0.0024 


0.010 0.44 0.59 



Table 3: Pairwise difference p— values for interpolated area under the force curve. Bolded 
values are statistically significant at p < 0.05. 





WT 


N348A 


N348W 


N348K 


Y211A 


Y211D 


Y211T vRlllA 


N348A/Y211A 


N348W/Y211A 


N348A 


0.77 


















N348W 


0.67 


0.42 
















N348K 


0.17 


0.056 


0.38 














Y211A 


0.00094 


3.5x10-°= 


0.0045 


0.063 












Y211D 


0.0072 


0.0051 


0.0016 


4.6x10-™ 


1.2x10-"° 










Y211T 


0.016 


0.014 


0.0041 


0.00016 


6.1x10-°° 


0.77 








vRlllA 


0.38 


0.18 


0.68 


0.67 


0.016 


0.00032 


0.00094 






N348A/Y211A 


0.061 


0.013 


0.18 


0.68 


0.17 


5.2x10-°° 


2.1-°= 0.38 






N348W/Y211A 


0.00017 


3.9x10-°* 


0.0011 


0.021 


0.68 


1.6x10-'° 


6.3x10-'° 0.0047 


0.063 




VR111A/Y211A 


0.56 


0.34 


0.86 


0.47 


0.0072 


0.0010 


0.0025 0.77 


0.24 


0.0019 



24 



Table 4: Pairwise differences in interpolated maximum force. Bolded values are statistically 
significant at p < 0.05. A Max Force is the mean max force for the first mutant listed in a 
row minus that of the second mutant listed. 



Mutant Pair 


A Max Force 


Mutant Pair 


A Max Force 


N348A-WT 


27.33 


Y211D-N348K 


144.08 


N348W-WT 


-39.87 


Y211T-N348K 


129.81 


N348K-WT 


-40.64 


VR111A-N348K 


23.03 


Y211A-WT 


-87.97 


N348 A / Y2 1 1 A-N348K 


31.60 


Y211D-WT 


103.44 


N348W/Y211A-N348K 


-114.15 


Y211T-WT 


89.17 


vRl 1 1 A/ Y2 1 1 A-N348K 


49.35 


vRlllA-WT 


-17.61 


Y211D-Y211A 


191.42 


N348A /Y21 1 A-WT 


-9.04 


Y211T-Y211A 


177.15 


N348 W / Y2 1 1 A-WT 


-154.79 


vRlllA-Y211A 


70.36 


vRlllA/Y211A-WT 


8.71 


N348A/Y211A-Y211A 


78.93 


N348W-N348A 


-67.20 


N348W/ Y2 1 1 A- Y21 1 A 


-66.82 


N348K-N348A 


-67.97 


vRl 11 A / Y2 1 1 A- Y21 1 A 


96.68 


Y211A-N348A 


-115.30 


Y211T-Y211D 


-14.27 


Y211D-N348A 


76.12 


vRlllA-Y211D 


-121.05 


Y211T-N348A 


61.85 


N348A/Y211A-Y211D 


-112.48 


vRlllA-N348A 


-44.94 


N348W/Y211A-Y211D 


-258.23 


N348 A / Y2 1 1 A-N348 A 


-36.37 


VR111A/Y211A-Y211D 


-94.74 


N348W / Y2 1 1 A-N348 A 


-182.12 


VR111A-Y211T 


-106.78 


vRl 1 1 A/Y21 1 A-N348 A 


-18.62 


N348A/Y211A-Y211T 


-98.21 


N348K-N348W 


-0.77 


N348W / Y2 1 1 A- Y2 1 IT 


-243.96 


Y211A-N348W 


-48.10 


vRl 1 1 A / Y2 1 1 A- Y2 1 IT 


-80.46 


Y211D-N348W 


143.31 


N348A/Y211A-VR111A 


8.57 


Y211T-N348W 


129.04 


N348W/Y211A-VR111A 


-137.18 


VR111A-N348W 


22.26 


VR111A/Y211A-R111A 


26.31 


N348A/Y211A-N348W 


30.83 


N348W/Y211A-N348A/Y211A 


-145.75 


N348W / Y2 1 1 A-N348 W 


-114.92 


vRl 1 1 A/ Y2 1 1 A-N348A / Y2 1 1 A 


17.75 


vRl 1 1 A/Y21 1 A-N348 W 


48.58 


VR111A/Y211A-N348W/Y211A 


163.50 


Y211A-N348K 


-47.33 
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